3D Convolutional Neural Networks for Efficient and Robust Hand Pose Estimation from Single Depth Images: Supplementary Material

Authors

  • Liuhao Ge
  • Hui Liang
  • Junsong Yuan
  • Daniel Thalmann
Abstract

We present all the 3D CNN models used in the experiment section. We experiment with projective D-TSDF volumes at three resolutions: 16, 32 and 64. Figure 1a shows the network architecture for projective D-TSDF input volumes of 32×32×32 resolution; this is the 3D CNN model we compare with state-of-the-art methods on the MSRA dataset [2] and the NYU dataset [3]. When the volume resolution is 16×16×16 or 64×64×64, the network architecture differs from that in Figure 1a. Figure 1b shows the architecture for projective D-TSDF input volumes of 16×16×16 resolution, where we remove one convolutional layer and one max pooling layer. Figure 1c shows the architecture for projective D-TSDF input volumes of 64×64×64 resolution, where we add one convolutional layer and one max pooling layer. We also experiment with different TSDF types: accurate TSDF, projective TSDF and projective D-TSDF. Accurate and projective TSDF volumes have only one channel, so the parameters of the architecture in Figure 1a must be adapted to a one-channel input. Figure 1d shows the architecture for accurate/projective TSDF input volumes of 32×32×32 resolution. Since the number of input channels is 1 instead of 3, we divide the number of output channels of each convolutional layer by 3.
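The four variants described in the abstract can be summarized as a small configuration sketch. The snippet below is only an illustration under stated assumptions: the base channel widths (96, 192, 384), the base count of two max pooling layers, the choice of which layer is dropped or duplicated, and the names `ModelSpec` and `make_spec` are all hypothetical placeholders, since the actual layer parameters are given only in Figure 1 and are not reproduced in this text. Only the rules themselves (one fewer or one more conv/pool pair for 16 or 64 resolution, output channels divided by 3 for one-channel input) come from the abstract.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical placeholders: the real widths and layer counts are in Figure 1a.
BASE_CONV_CHANNELS = (96, 192, 384)
BASE_NUM_POOL = 2


@dataclass(frozen=True)
class ModelSpec:
    resolution: int                  # input volume is resolution^3 voxels
    in_channels: int                 # 3 for projective D-TSDF, 1 for accurate/projective TSDF
    conv_channels: Tuple[int, ...]   # output channels of each conv layer
    num_pool: int                    # number of max pooling layers


def make_spec(resolution: int = 32, in_channels: int = 3) -> ModelSpec:
    """Derive a variant from the 32^3, three-channel base model."""
    if resolution not in (16, 32, 64):
        raise ValueError("resolution must be 16, 32 or 64")
    if in_channels not in (1, 3):
        raise ValueError("in_channels must be 1 or 3")

    channels = list(BASE_CONV_CHANNELS)
    num_pool = BASE_NUM_POOL

    if resolution == 16:
        # One conv layer and one max pooling layer fewer (Figure 1b).
        # Dropping the last conv layer is an assumption of this sketch.
        channels.pop()
        num_pool -= 1
    elif resolution == 64:
        # One conv layer and one max pooling layer more (Figure 1c).
        # Reusing the last width for the extra layer is an assumption.
        channels.append(channels[-1])
        num_pool += 1

    if in_channels == 1:
        # One-channel input: divide conv output channels by 3 (Figure 1d).
        channels = [c // 3 for c in channels]

    return ModelSpec(resolution, in_channels, tuple(channels), num_pool)


if __name__ == "__main__":
    for res, ch in [(32, 3), (16, 3), (64, 3), (32, 1)]:
        print(make_spec(res, ch))
```

The abstract only describes the one-channel variant at 32×32×32 resolution; the sketch accepts other combinations purely for convenience.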


Similar resources

Hand Gesture Recognition from RGB-D Data using 2D and 3D Convolutional Neural Networks: a comparative study

Despite considerable advances in recognizing hand gestures from still images, there are still many challenges in the classification of hand gestures in videos. The latter comes with more challenges, including higher computational complexity and the arduous task of representing temporal features. Hand movement dynamics, represented by temporal features, have to be extracted by analyzing the total fr...


Estimation of Hand Skeletal Postures by Using Deep Convolutional Neural Networks

Hand posture estimation attracts researchers because of its many applications. Hand posture recognition systems simulate the hand postures by using mathematical algorithms. Convolutional neural networks have provided the best results in the hand posture recognition so far. In this paper, we propose a new method to estimate the hand skeletal posture by using deep convolutional neural networks. T...


V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map

Most of the existing deep learning-based methods for 3D hand and human pose estimation from a single depth map are based on a common framework that takes a 2D depth map and directly regresses the 3D coordinates of keypoints, such as hand or human body joints, via 2D convolutional neural networks (CNNs). The first weakness of this approach is the presence of perspective distortion in the 2D dept...


Feedback Loop and Accurate Training Data for 3D Hand Pose Estimation

In this work, we present an entirely data-driven approach to estimating the 3D pose of a hand given a depth image. We show that we can correct the mistakes made by a Convolutional Neural Network (CNN) trained to predict an estimate of the 3D pose by using a feedback loop of Deep Networks, also utilizing a CNN architecture. Since this approach critically relies on a training set of labeled frame...


Head Pose Estimation Using Convolutional Neural Networks

Detection and estimation of head pose is a fundamental problem in many applications such as automatic face recognition, intelligent surveillance, and perceptual human-computer interfaces. In an application like driving, the pose of the driver is used to estimate his gaze and alertness, where faces in the images are non-frontal with various poses. In this work head pose of the person is used to ...



Publication date: 2017